
introduce a new integrated "codeflash optimize" command #384


Merged: 34 commits merged from `trace-and-optimize` into `main` on Jul 4, 2025

Conversation

misrasaurabh1
Contributor

@misrasaurabh1 misrasaurabh1 commented Jun 26, 2025

PR Type

Enhancement, Tests


Description

  • Introduce integrated optimize CLI command

  • Add FunctionRanker class for ttX-based ranking

  • Extend tracer: generate replay test and invoke optimizer

  • Update tests: unit and end-to-end optimize flows


Changes diagram

flowchart LR
  CLI["codeflash optimize command"] --> TR["Tracer.trace code"]
  TR --> RT["Generate replay test"]
  RT --> FR["FunctionRanker.rerank_and_filter"]
  FR --> OPT["Optimizer.run_with_args"]
  OPT --> OUT["Optimization results"]

Changes walkthrough 📝

Relevant files
Enhancement
7 files
workload.py
Add sleep and heavy compute in SimpleModel.predict             
+10/-1   
function_ranker.py
New `FunctionRanker` for function profiling ranking           
+144/-0 
cli.py
Add `optimize` subcommand and CI flag parsing                       
+10/-1   
env_utils.py
Add `is_ci()` helper for CI environment detection               
+6/-0     
functions_to_optimize.py
Integrate trace path and rerank functions workflow             
+53/-3   
tracer.py
Support replay tests, static methods, optimization chaining
+51/-8   
profile_stats.py
Include `class_name` and normalize caller keys                     
+14/-2   
Configuration changes
2 files
config_consts.py
Define `DEFAULT_IMPORTANCE_THRESHOLD` constant                     
+1/-0     
pyproject.toml
Configure pytest warning filters                                                 
+6/-0     
Formatting
3 files
server.py
Fix import order and comma formatting in LSP server           
+7/-6     
server_entry.py
Change `setup_logging` return type signature                         
+4/-3     
posthog_cf.py
Clean up trailing comments and commas                                       
+2/-2     
Miscellaneous
1 file
pickle_patcher.py
Remove debug print in placeholder creation                             
+0/-2     
Tests
3 files
end_to_end_test_tracer_replay.py
Update traced count and coverage expectations                       
+2/-2     
end_to_end_test_utilities.py
Switch to `codeflash.main optimize` invocation                     
+10/-24 
test_function_ranker.py
Add unit tests for `FunctionRanker` methods                           
+172/-0 
Additional files
1 file
codeflash.trace [link]   

@misrasaurabh1 misrasaurabh1 marked this pull request as draft June 26, 2025 03:50
    github-actions bot commented Jun 26, 2025

    PR Reviewer Guide 🔍

    (Review updated until commit 9addd95)

    Here are some key observations to aid the review process:

    ⏱️ Estimated effort to review: 5 🔵🔵🔵🔵🔵
    🧪 PR contains tests
    🔒 Security concerns

    Sensitive information exposure:
    A live PostHog API key is committed in codeflash/telemetry/posthog_cf.py, which could be abused if extracted. Replace with secure retrieval (e.g., environment variable) and remove the embedded key.

    ⚡ Recommended focus areas for review

    Sensitive Info

    The PostHog project API key is hardcoded in source, risking exposure of credentials.

    _posthog = Posthog(project_api_key="phc_aUO790jHd7z1SXwsYCz8dRApxueplZlZWeDSpKc5hol", host="https://us.posthog.com")
    CLI Parsing

    Using parse_known_args with sys.argv reassignment may drop or reorder flags, leading to unexpected behavior for subcommands.

    args, unknown_args = parser.parse_known_args()
    sys.argv[:] = [sys.argv[0], *unknown_args]
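The hazard can be shown with a minimal, self-contained reproduction (flag names here are hypothetical, not the project's actual parser): any flag the outer parser recognizes is consumed by `parse_known_args`, so a downstream subcommand that re-parses `sys.argv` never sees it.

```python
import argparse
import sys

# Outer parser knows --verbose; a hypothetical subcommand also wants it.
parser = argparse.ArgumentParser()
parser.add_argument("--verbose", action="store_true")

sys.argv = ["codeflash", "--verbose", "--replay-test", "tests/replay.py"]
args, unknown_args = parser.parse_known_args()
sys.argv[:] = [sys.argv[0], *unknown_args]

# --verbose was consumed by the outer parser and is gone for any
# subcommand parser that re-reads sys.argv after this point.
print(sys.argv)  # ['codeflash', '--replay-test', 'tests/replay.py']
```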
    Performance Bottleneck

    The nested loops and sleep in SimpleModel.predict will significantly slow benchmarking and may skew optimization results.

    sleep(0.1) # can be optimized away
    for i in range(500):
        for x in data:
            computation = 0
            computation += x * i ** 2
            result.append(computation)
    return result
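The snippet above is a deliberately slow benchmarking fixture; as a hedged sketch (function names here are illustrative, not the project's code), an optimizer can drop the sleep and collapse the loop body into an equivalent comprehension because `computation` is reset on every iteration:

```python
def predict_slow(data):
    # Mirrors the fixture above: computation is reassigned each inner
    # iteration, so each appended value is just x * i ** 2.
    result = []
    for i in range(500):
        for x in data:
            computation = 0
            computation += x * i ** 2
            result.append(computation)
    return result


def predict_fast(data):
    # Same values in the same order (i outer, x inner), no dead stores.
    return [x * i * i for i in range(500) for x in data]
```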


    github-actions bot commented Jun 26, 2025

    PR Code Suggestions ✨

    Latest suggestions up to 9addd95
    Explore these optional code suggestions:

    Security
    Load API key from environment

    Avoid committing sensitive API keys to source. Load the PostHog project_api_key from
    an environment variable at runtime and error out if it's missing.

    codeflash/telemetry/posthog_cf.py [24]

    -_posthog = Posthog(project_api_key="phc_aUO790jHd7z1SXwsYCz8dRApxueplZlZWeDSpKc5hol", host="https://us.posthog.com")
    +import os
     
    +project_api_key = os.getenv("POSTHOG_API_KEY")
    +if not project_api_key:
    +    logger.error("POSTHOG_API_KEY environment variable is not set")
    +    return
    +_posthog = Posthog(project_api_key=project_api_key, host="https://us.posthog.com")
    +
    Suggestion importance[1-10]: 9


    Why: Hardcoding the PostHog key is a security risk; loading from POSTHOG_API_KEY ensures credentials aren’t committed and fails fast if not set.

Impact: High
    Possible issue
    Support both schema versions

    The code now unpacks an extra class_name column which may not exist in the pstats
    schema, leading to unpack errors. Handle both schema versions by inspecting row
    length and defaulting class_name when missing.

    codeflash/tracing/profile_stats.py [26-36]

    -for (
    -    filename,
    -    line_number,
    -    function,
    -    class_name,
    -    call_count_nonrecursive,
    -    num_callers,
    -    total_time_ns,
    -    cumulative_time_ns,
    -    callers,
    -) in pdata:
    +for row in pdata:
    +    # support both old and new schemas
    +    if len(row) == 8:
    +        filename, line_number, function, call_count_nonrecursive, num_callers, total_time_ns, cumulative_time_ns, callers = row
    +        class_name = None
    +    else:
    +        filename, line_number, function, class_name, call_count_nonrecursive, num_callers, total_time_ns, cumulative_time_ns, callers = row
    +    loaded_callers = json.loads(callers)
    +    ...
    Suggestion importance[1-10]: 6


    Why: The new class_name unpack assumes an updated schema and may break older traces; checking row length and defaulting class_name maintains backward compatibility.

Impact: Low

    Previous suggestions

    Suggestions up to commit 4debe7e
    Possible issue
    Include optimize subcommand in invocation

    Include the "optimize" subcommand in the sys.argv array so the top-level CLI parser
    recognizes it and invokes the optimization phase correctly.

    codeflash/tracer.py [873]

    -sys.argv = ["codeflash", "--replay-test", str(replay_test_path)]
    +sys.argv = ["codeflash", "optimize", "--replay-test", str(replay_test_path)]
    Suggestion importance[1-10]: 8


    Why: The CLI invocation omits the "optimize" subcommand causing the top-level parser to misinterpret "--replay-test", so including it is critical for correct workflow.

Impact: Medium
    General
    Limit sys.argv mutation to optimize

    Restrict the mutation of sys.argv to only when the "optimize" command is active, so
    other subcommands receive their intended arguments and avoid unintended stripping.

    codeflash/cli_cmds/cli.py [73-74]

     args, unknown_args = parser.parse_known_args()
    -sys.argv[:] = [sys.argv[0], *unknown_args]
    +if args.command == "optimize":
    +    sys.argv[:] = [sys.argv[0], *unknown_args]
    Suggestion importance[1-10]: 6


    Why: Restricting the global sys.argv rewrite to the optimize command prevents other subcommands from losing their arguments and improves correctness.

Impact: Low
    Hoist tracer_main import

    Move the import of tracer_main to the module level to avoid repeated imports each
    time parse_args is called and clarify dependencies.

    codeflash/cli_cmds/cli.py [26-27]

    -trace_optimize = subparsers.add_parser("optimize", help="Trace and optimize a Python project.")
     from codeflash.tracer import main as tracer_main
     
    +trace_optimize = subparsers.add_parser("optimize", help="Trace and optimize a Python project.")
    +
    Suggestion importance[1-10]: 3


    Why: Moving the import to module scope reduces repeated imports in parse_args, but the performance gain is minimal and context clarity is modest.

Impact: Low

    @KRRT7 KRRT7 force-pushed the trace-and-optimize branch from 08464c4 to 0b4fcb6 Compare June 30, 2025 19:04
    codeflash-ai bot added a commit that referenced this pull request Jun 30, 2025
    …(`trace-and-optimize`)
    
    Here is an optimized rewrite of your `FunctionRanker` class.  
    **Key speed optimizations applied:**
    
    1. **Avoid repeated loading of function stats:**  
       The original code reloads function stats for each function during ranking (`get_function_ttx_score()` is called per function and loads/returns). We prefetch stats once in `rank_functions()` and reuse them for all lookups.
    
    2. **Inline and batch lookups:**  
       We use a helper to batch compute scores directly via a pre-fetched `stats` dict. This removes per-call overhead from attribute access and creation of possible keys inside the hot loop.
    
    3. **Minimal string operations:**  
       We precompute the two possible key formats needed for lookup (file:qualified and file:function) for all items only ONCE, instead of per invocation.
    
    4. **Skip list-comprehension in favor of tuple-unpacking:**  
       Use generator expressions for lower overhead when building output.
    
    5. **Fast path with `dict.get()` lookup:**  
       Avoid redundant `if key in dict` by just trying `dict.get(key)`.
    
6. **Behavior preserved:**
   No signatures are changed and no classes or functions are renamed; all logging, ordering, and functionality is preserved.
    
    
    
    
    **Summary of performance impact:**  
    - The stats are loaded only once, not per function.
    - String concatenations for keys are only performed twice per function (and not redundantly in both `rank_functions` and `get_function_ttx_score`).
    - All lookup and sorting logic remains as in the original so results will match, but runtime (especially for large lists) will be significantly better.  
    - If you want, you could further optimize by memoizing scores with LRU cache, but with this design, dictionary operations are already the bottleneck, and this is the lowest-overhead idiomatic Python approach.  
    - No imports, function names, or signatures are changed.  
    
    Let me know if you need further GPU-based or numpy/pandas-style speedups!
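The prefetch-and-batch pattern the commit describes can be sketched as follows; the helper name, the function-record fields, and the stats layout are illustrative assumptions, not the project's actual `FunctionRanker` API:

```python
def rank_functions(functions, stats):
    """Rank functions by a precomputed ttX score, loading stats only once.

    `stats` is a dict fetched a single time by the caller, so no per-function
    reload happens inside the loop.
    """
    scored = []
    for fn in functions:
        # Precompute both candidate key formats exactly once per function.
        qualified_key = f"{fn['file']}:{fn['qualified_name']}"
        simple_key = f"{fn['file']}:{fn['name']}"
        # dict.get avoids the redundant `key in stats` membership test.
        entry = stats.get(qualified_key) or stats.get(simple_key)
        score = entry["ttx"] if entry else 0.0
        scored.append((score, fn))
    # Highest ttX score first; ordering of the original logic is preserved.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [fn for _, fn in scored]
```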
    Contributor

    codeflash-ai bot commented Jun 30, 2025

    ⚡️ Codeflash found optimizations for this PR

    📄 13% (0.13x) speedup for FunctionRanker.rank_functions in codeflash/benchmarking/function_ranker.py

⏱️ Runtime: 1.84 milliseconds → 1.62 milliseconds (best of 67 runs)

    I created a new dependent PR with the suggested changes. Please review:

    If you approve, it will be merged into this PR (branch trace-and-optimize).

    KRRT7 and others added 5 commits June 30, 2025 16:06
    …384 (`trace-and-optimize`)
    
Here is an **optimized** version of your code, focusing on the `_get_function_stats` function, the proven performance bottleneck per your line profiling.
    
    ### Optimizations Applied
    
    1. **Avoid Building Unneeded Lists**:  
       - Creating `possible_keys` as a list incurs per-call overhead.  
       - Instead, directly check both keys in sequence, avoiding the list entirely.
    
    2. **Short-circuit Early Return**:  
       - Check for the first key (`qualified_name`) and return immediately if found (no need to compute or check the second unless necessary).
    
    3. **String Formatting Optimization**:  
       - Use f-strings directly in the condition rather than storing/interpolating beforehand.
    
    4. **Comment Retention**:  
       - All existing and relevant comments are preserved, though your original snippet has no in-method comments.
    
    ---
    
    
    
    ---
    
    ### Rationale
    
    - **No lists** or unneeded temporary objects are constructed.
    - Uses `.get`, which is faster than `in` + lookup.
    - Returns immediately upon match.
    
    ---
    
    **This change will reduce total runtime and memory usage significantly in codebases with many calls to `_get_function_stats`.**  
    Function signatures and return values are unchanged.
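A hedged before/after sketch of the list-free, short-circuiting lookup the commit describes (function and parameter names are hypothetical, not the project's actual signatures):

```python
def get_stats_before(stats, module, qualified_name, name):
    # Original pattern: builds a temporary list, then does `in` + lookup
    # (two dict probes per hit).
    possible_keys = [f"{module}:{qualified_name}", f"{module}:{name}"]
    for key in possible_keys:
        if key in stats:
            return stats[key]
    return None


def get_stats_after(stats, module, qualified_name, name):
    # Optimized pattern: no intermediate list; .get() probes once and we
    # return immediately on the first match, so the second key is only
    # formatted when actually needed.
    hit = stats.get(f"{module}:{qualified_name}")
    if hit is not None:
        return hit
    return stats.get(f"{module}:{name}")
```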
    Contributor

    codeflash-ai bot commented Jul 1, 2025

    ⚡️ Codeflash found optimizations for this PR

    📄 51% (0.51x) speedup for FunctionRanker._get_function_stats in codeflash/benchmarking/function_ranker.py

⏱️ Runtime: 497 microseconds → 330 microseconds (best of 51 runs)

    I created a new dependent PR with the suggested changes. Please review:

    If you approve, it will be merged into this PR (branch trace-and-optimize).

    …25-07-01T22.08.43
    
    ⚡️ Speed up method `FunctionRanker._get_function_stats` by 51% in PR #384 (`trace-and-optimize`)
    Contributor

    codeflash-ai bot commented Jul 1, 2025


    github-actions bot commented Jul 2, 2025

    Persistent review updated to latest commit 9addd95

    codeflash-ai bot added a commit that referenced this pull request Jul 3, 2025
    …imize`)
    
    Here's an optimized version of your Python program, focused on runtime and memory.
    
    **Key changes:**
    - Avoids reading the event file or parsing JSON if not needed.
    - Reads the file as binary and parses with `json.loads()` for slightly faster IO.
    - References the `"draft"` property directly using `.get()` to avoid possible `KeyError`.
    - Reduces scope of data loaded from JSON for less memory usage.
    - Caches the result of parsing the event file for repeated calls within the same process.
    
    
    
    - The inner try/except is kept close to only catching the specific case.
    - Results for each event_path file are cached in memory.
    - Exception handling and comments are preserved where their context is changed.
    - I/O and JSON parsing is only done if both env vars are set and PR number exists.
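The caching approach the commit describes can be sketched like this; the helper name and the exact short-circuit conditions are illustrative assumptions, not the project's actual `env_utils` code:

```python
import json
from functools import lru_cache


@lru_cache(maxsize=None)
def _load_event(event_path):
    # Parse the GitHub event payload at most once per path per process;
    # binary read + json.loads is marginally faster than text-mode json.load.
    with open(event_path, "rb") as f:
        return json.loads(f.read())


def is_pr_draft(event_path, pr_number):
    # Skip file I/O and JSON parsing entirely when the environment did not
    # provide an event path or a PR number.
    if not event_path or pr_number is None:
        return False
    event = _load_event(event_path)
    # .get() avoids a KeyError when the payload has no pull_request section.
    return bool(event.get("pull_request", {}).get("draft", False))
```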
    Contributor

    codeflash-ai bot commented Jul 3, 2025

    ⚡️ Codeflash found optimizations for this PR

    📄 121% (1.21x) speedup for is_pr_draft in codeflash/code_utils/env_utils.py

⏱️ Runtime: 4.98 milliseconds → 2.25 milliseconds (best of 94 runs)

    I created a new dependent PR with the suggested changes. Please review:

    If you approve, it will be merged into this PR (branch trace-and-optimize).

    Contributor

    KRRT7 commented Jul 3, 2025

    @misrasaurabh1 ready to review, can't tag you normally b/c you're the author

    @misrasaurabh1 misrasaurabh1 requested a review from aseembits93 July 3, 2025 22:48
    @misrasaurabh1 misrasaurabh1 enabled auto-merge July 3, 2025 22:51
    pyproject.toml Outdated
[tool.pytest.ini_options]
filterwarnings = [
    "ignore::pytest.PytestCollectionWarning",
    "ignore::pytest.PytestUnknownMarkWarning",
]
Contributor
    reasoning for PytestUnknownMarkWarning @KRRT7 ?

Contributor

@KRRT7 KRRT7 Jul 3, 2025

    in codeflash/models/models.py we have a few Classes that are prefixed with Test, like

    @dataclass(frozen=True)
    class TestsInFile:
        test_file: Path
        test_class: Optional[str]
        test_function: str
        test_type: TestType

    which pytest complains about :
[Screenshot, 2025-07-03: pytest collection-warning output]
    it is very noisy so I've disabled them
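An alternative to filtering the warning globally is to mark such classes as non-tests explicitly: pytest skips collecting any class whose `__test__` attribute is `False`. A minimal sketch based on the `TestsInFile` dataclass quoted above (the `TestType` field is omitted here so the example is self-contained):

```python
from dataclasses import dataclass
from pathlib import Path
from typing import Optional


@dataclass(frozen=True)
class TestsInFile:
    # No annotation, so this is a plain class attribute rather than a
    # dataclass field; pytest honors it and will not try to collect the class.
    __test__ = False

    test_file: Path
    test_class: Optional[str]
    test_function: str
```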

Contributor

    any examples of PytestUnknownMarkWarning specifically you've seen so far. I've not seen them yet.

Contributor

    it gets triggered on the pytests that have the skip_ci markers

    @misrasaurabh1 misrasaurabh1 disabled auto-merge July 4, 2025 03:01
    @misrasaurabh1 misrasaurabh1 merged commit 75810a3 into main Jul 4, 2025
    16 of 17 checks passed

    3 participants